The Gaussian Channel with Noisy Feedback: 
Improving Reliability via Interaction 

Assaf Ben-Yishai and Ofer Shayevitz 


Abstract: Consider a pair of terminals connected by two 
independent (feedforward and feedback) Additive White Gaussian 
Noise (AWGN) channels, and limited by individual power 
constraints. The first terminal would like to reliably send information 
to the second terminal at a given rate. While the reliability in the 
cases of no feedback and of noiseless feedback is well studied, not 
much is known about the case of noisy feedback. In this work, 
we present an interactive scheme that significantly improves 
the reliability relative to the no-feedback setting, whenever the 
feedback Signal to Noise Ratio (SNR) is sufficiently larger than 
the feedforward SNR. The scheme combines Schalkwijk-Kailath 
(S-K) coding and modulo-lattice analog transmission. 

I. Introduction 

Feedback cannot improve the capacity of point-to-point 
memoryless channels [1]. Nevertheless, noiseless feedback can 
significantly simplify the transmission schemes and improve 
the error probability performance, see e.g. [?]. These 
elegant schemes fail, however, in the presence of arbitrarily 
small feedback noise, rendering them grossly impractical. This 
fact was initially observed in [2] for the AWGN channel, 
and further strengthened in [?]. 

In a previous work [6] we presented a variation of the 
noiseless-feedback AWGN S-K scheme [2], extending it to 
the case of noisy feedback. The scheme was based on the 
following observation: In each round, the receiver has some 
estimate of the message, and the transmitter needs to learn the 
associated estimation error in order to proceed. This estimation 
error can be conveyed in a power-efficient manner by using 
the knowledge of the message at the transmitter as side 
information. The main focus of [6] was on the simplicity 
of the scheme in a fixed error probability regime, and side 
information was used by applying scalar modulo operations. 
This resulted in a major improvement of the capacity gap in 
a relatively small number of rounds. 

The focus of this work is on the virtues of noisy feedback 
for increasing reliability. To that end, an asymptotic 
generalization of the scheme in [6] is introduced, applying the 
S-K scheme over blocks and replacing the scalar modulo with a 
multi-dimensional lattice modulo, as well as replacing the Pulse 
Amplitude Modulation (PAM) used in [6] with a block code. 
An asymptotic error analysis is provided, using the Poltyrev 
exponent to account for modulo-aliasing errors, and channel 
coding error exponents to account for the error of the block 
code. The resulting error exponent is computed and shown to 
surpass the sphere-packing bound of the feedforward channel 
for a wide range of rates and SNR settings. 

In [?], [?], the authors analyzed the reliability function of 
the AWGN channel at zero rate for noisy passive feedback, i.e. where 
the channel outputs are fed back without any processing. In 
[?], which is closer to our interactive setting, the reliability 
function of the AWGN channel at zero rate (two messages) with 
noisy active feedback has been considered. Specifically, it 
was shown that active feedback roughly quadruples the error 
exponent relative to passive feedback. The achievability result 
of [?] is better than ours at zero rate. 

II. Preliminaries 

We write log for the base 2 logarithm, and ln for the natural 
logarithm. We use the vector notation $x^n \stackrel{\mathrm{def}}{=} [x_1, \ldots, x_n]$ and 
boldface letters such as $\mathbf{x}$ to indicate vectors of size N. We 
write $a_n \,\dot{\geq}\, b_n$ to mean $\liminf_{n\to\infty} \frac{1}{n}\ln\left(\frac{a_n}{b_n}\right) \geq 0$, and 
similarly define $\dot{\leq}$ and $\doteq$. 

A. Lattice Properties 

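(i) We denote a lattice of dimension N by $\Lambda = G \cdot \mathbb{Z}^N$, 
where G is the generating matrix. 

(ii) $V(\Lambda) = |\det(G)|$ is the lattice cell volume. 

(iii) We denote the nearest neighbor quantization of $\mathbf{x}$ to the 
lattice $\Lambda$ by $Q_\Lambda[\mathbf{x}]$. 

(iv) We denote the fundamental Voronoi cell by $\mathcal{V}_0 \stackrel{\mathrm{def}}{=} \{\mathbf{x} : Q_\Lambda[\mathbf{x}] = \mathbf{0}\}$. 

(v) Modulo $\Lambda$ is defined as $M_\Lambda[\mathbf{x}] \stackrel{\mathrm{def}}{=} \mathbf{x} - Q_\Lambda[\mathbf{x}]$. 

(vi) $M_\Lambda[\cdot]$ satisfies the distributive law: $M_\Lambda[M_\Lambda[\mathbf{x}] + \mathbf{y}] = M_\Lambda[\mathbf{x} + \mathbf{y}]$. 

(vii) The volume to noise ratio (VNR) of a lattice in the presence 
of AWGN with variance $\sigma^2$ is $\mu \stackrel{\mathrm{def}}{=} V^{2/N}(\Lambda)/\sigma^2$. 

(viii) The normalized second moment of a lattice $\Lambda$ is $G(\Lambda) \stackrel{\mathrm{def}}{=} \sigma^2(\Lambda)/V^{2/N}(\Lambda)$, 
where $\sigma^2(\Lambda) = \frac{1}{N}\mathbb{E}(\|\mathbf{U}\|^2)$ and $\mathbf{U}$ is 
uniformly distributed on $\mathcal{V}_0$. 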

B. Joint Source Channel Coding (JSCC) 

It is well known [9] that when a Gaussian source is 
conveyed over an AWGN channel under a quadratic distortion 
measure, analog transmission obtains the optimal distortion with 
minimal delay. The transmitter merely has to scale the source 
Q to the channel input power constraint, and the receiver 
merely has to multiply by the appropriate Wiener coefficient 
in order to obtain the optimal linear estimate. This solution is 
a simple case of joint source channel coding (JSCC). 

If side information related to the source Q is present at the 
receiver, the problem is known as the Wyner-Ziv problem [10]. 
Kochman and Zamir [11] gave the solution of JSCC with side 
information (related to the source and the channel) over an 
AWGN channel with a quadratic distortion measure. They used analog 
transmission as in [9] in conjunction with dithered modulo-lattice 
operations that take care of the side information. 

Let us briefly survey their result in the case of 
side information related to the source only. The source vector to 
be conveyed is $\mathbf{Q} + \mathbf{J}$, where the destination has $\mathbf{J}$ as side 
information. The channel is AWGN with input $\mathbf{X}$, noise $\mathbf{Z}$ 
and output $\mathbf{Y}$, i.e. $\mathbf{Y} = \mathbf{X} + \mathbf{Z}$. The transmitter sends: 

$\mathbf{X} = M_\Lambda[\beta(\mathbf{J} + \mathbf{Q}) + \mathbf{V}]$ 

where $\mathbf{V}$ is the dither vector, uniformly distributed on $\mathcal{V}_0$ (the 
basic Voronoi cell of $\Lambda$) and commonly known at the 
transmitter and receiver. The receiver first calculates the temporary 
variable $\mathbf{T}$ as follows: 

$\mathbf{T} = \alpha_c \mathbf{Y} - \mathbf{V} - \beta\mathbf{J} = \mathbf{X} + \mathbf{Z}_{eq} - \mathbf{V} - \beta\mathbf{J}$ 

where the second equality follows directly from the definition of 
the equivalent noise $\mathbf{Z}_{eq}$: 

$\mathbf{Z}_{eq} \stackrel{\mathrm{def}}{=} -(1 - \alpha_c)\mathbf{X} + \alpha_c\mathbf{Z}$ 

The receiver now applies another modulo operation on $\mathbf{T}$, 
obtaining $\mathbf{U}$ as follows: 

$\mathbf{U} = M_\Lambda[\mathbf{T}] = M_\Lambda[\beta\mathbf{Q} + \mathbf{Z}_{eq}]$ 

where the second equality is due to the distributive law of 
$M_\Lambda[\cdot]$. Now, if $\beta\mathbf{Q} + \mathbf{Z}_{eq} \in \mathcal{V}_0$, then $\mathbf{U} = \beta\mathbf{Q} + \mathbf{Z}_{eq}$. We 
show in the sequel that by appropriate parameter settings, 
the probability of the complementary event can be made 
exponentially small with respect to the lattice dimension N. 

In [11] a linear estimate of $\mathbf{Q}$ was obtained from $\mathbf{U}$. 
However, in our scheme the scaling coefficient of this estimate 
naturally cancels out, rendering its setting immaterial. We note 
that setting $\alpha_c < 1$ is common 
practice in many lattice problems, improving performance at 
lower SNRs and making $\mathbf{Z}_{eq}$ non-Gaussian (which usually 
improves the error probability). For clarity of exposition we 
use in this work only $\alpha_c = 1$. 
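The dithered modulo operations above can be illustrated in one dimension. The following is a minimal sketch, assuming the scalar lattice delta*Z (so the Voronoi cell is the interval [-delta/2, delta/2)) and alpha_c = 1; the function names `mod_lattice` and `jscc_round` are hypothetical and chosen only for this illustration.

```python
def mod_lattice(x, delta):
    """Scalar modulo for the lattice delta*Z: reduce x into [-delta/2, delta/2)."""
    return (x + delta / 2) % delta - delta / 2


def jscc_round(q, j, z, v, beta, delta):
    """One channel use of the dithered modulo scheme with alpha_c = 1.

    q: the part the receiver does not know, j: side information at the receiver,
    z: channel noise sample, v: dither known to both ends.
    Returns the channel input x and the receiver's variable u = M[t].
    """
    x = mod_lattice(beta * (j + q) + v, delta)   # transmitter: X = M[beta(J+Q)+V]
    y = x + z                                    # AWGN channel
    t = y - v - beta * j                         # receiver: T = Y - V - beta*J
    return x, mod_lattice(t, delta)              # U = M[T] = M[beta*Q + Z]


if __name__ == "__main__":
    # The side information j is far larger than the cell, yet the input stays
    # inside the cell and u recovers beta*q + z because |beta*q + z| < delta/2.
    x, u = jscc_round(q=0.2, j=37.5, z=-0.1, v=0.7, beta=1.0, delta=4.0)
    print("channel input:", x, "receiver output:", u)
```

The channel input never leaves the cell regardless of how large the side information is, which is the power saving the scheme relies on; a modulo-aliasing error corresponds to beta*q + z falling outside the cell.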

C. The Schalkwijk-Kailath (S-K) Scheme 

The famous S-K scheme [2] for capacity-achieving 
communication over AWGN can be interpreted using JSCC tools. 
The classic scheme encodes the message W into a message 
point $\Theta$ using single-dimensional modulation. At the end of 
the first step, k = 1, Terminal B sets its estimate of $\Theta$ to be 
$\hat\Theta_1 = Y_1$. In subsequent steps k, Terminal B feeds back $\hat\Theta_k$ to 
Terminal A. At step k + 1, Terminal A extracts the estimation 
error $\varepsilon_k = \hat\Theta_k - \Theta$ and conveys it to Terminal B by JSCC. 
Namely, Terminal A sends $X_{k+1} = \alpha_{k+1}\varepsilon_k$, where $\alpha_{k+1}$ is set so 
as to meet the channel input power constraint, and Terminal 
B linearly estimates $\hat\varepsilon_k = \beta_{k+1}Y_{k+1}$, where $\beta_{k+1}$ is set so 
as to minimize the Mean Squared Error (MSE). Having $\hat\varepsilon_k$, 
Terminal B now advances its estimate by $\hat\Theta_{k+1} = \hat\Theta_k - \hat\varepsilon_k$. 
Finally, at step K, Terminal B decodes W from $\hat\Theta_K$. 

For the sake of analysis it is convenient to observe the 
series of channels from $\Theta$ to $\hat\Theta_k$. These are Gaussian channels 
whose noise variance is $\sigma^2$ at k = 1, and it is easy to see 
that optimizing over $\alpha_k$ and $\beta_k$ reduces the noise variance by 
a factor of 1 + SNR at every step. So, at the final step K we have a 
channel whose SNR is $\mathrm{SNR}(1 + \mathrm{SNR})^{K-1}$. At this step it 
can be shown that mapping W into $\Theta$ using PAM and giving 
a Gaussian analysis of the error probability can yield a rate 
arbitrarily close to the channel capacity by taking a sufficiently 
large K. 
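The per-round variance reduction can be checked with a short recursion. This is a sketch under stated assumptions: noiseless feedback, noise variance normalized to 1 (so the power P equals the SNR), and a message point of power P; the helper name `sk_effective_snr` is hypothetical.

```python
def sk_effective_snr(snr, num_rounds):
    """Effective SNR of the channel from Theta to its estimate after each round.

    Assumes noiseless feedback and sigma^2 = 1, so the input power is P = snr.
    """
    power, noise_var = snr, 1.0
    err_var = noise_var                 # round 1: Theta_hat_1 = Y_1 = Theta + Z_1
    history = [power / err_var]
    for _ in range(num_rounds - 1):
        alpha = (power / err_var) ** 0.5            # scale the error to power P
        # Wiener coefficient for estimating eps from Y = alpha * eps + Z
        beta = alpha * err_var / (alpha ** 2 * err_var + noise_var)
        err_var -= beta * alpha * err_var           # MMSE after subtracting the estimate
        history.append(power / err_var)
    return history


if __name__ == "__main__":
    for k, value in enumerate(sk_effective_snr(snr=3.0, num_rounds=4), start=1):
        print(f"round {k}: effective SNR = {value:.6g}")
```

Each pass divides the error variance by 1 + SNR, so the last entry of the returned list agrees with SNR(1 + SNR)^(K-1) up to rounding.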


D. Error Exponents 

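Consider the case where a lattice point $\mathbf{X} \in \Lambda$ is sent over 
an AWGN channel $\mathbf{Y} = \mathbf{X} + \mathbf{Z}$ and the decoder estimates 
$\hat{\mathbf{X}}(\mathbf{Y})$ according to a Maximum Likelihood (ML) decoding 
rule. Then there exist lattices whose probability of decoding 
error is exponentially upper bounded by $\Pr(\hat{\mathbf{X}}(\mathbf{Y}) \neq \mathbf{X}) \leq e^{-N E_p(\mu/(2\pi e))}$, 
where $\mu$ is the VNR w.r.t. the lattice and the 
channel noise variance, and $E_p(\cdot)$ is the Poltyrev error exponent 
given by [12], [13]: 

$E_p(x) = \begin{cases} \frac{1}{2}(x - 1 - \ln x) & \text{if } 1 < x \leq 2 \\ \frac{1}{2}\left(\ln x + \ln\frac{e}{4}\right) & \text{if } 2 \leq x \leq 4 \\ \frac{x}{8} & \text{if } x \geq 4 \end{cases}$ 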
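The three branches of the Poltyrev exponent translate directly into code. A minimal sketch follows; the function name `poltyrev_exponent` is hypothetical, and returning 0 for arguments at or below 1 is an assumption, since the text only specifies the exponent for arguments above 1.

```python
import math


def poltyrev_exponent(x):
    """Poltyrev exponent E_p(x) in nats, x being the VNR divided by 2*pi*e."""
    if x <= 1.0:
        return 0.0                                  # assumption: no positive exponent here
    if x <= 2.0:
        return 0.5 * (x - 1.0 - math.log(x))
    if x <= 4.0:
        return 0.5 * math.log(math.e * x / 4.0)     # same as 0.5 * (ln x + ln(e/4))
    return x / 8.0


if __name__ == "__main__":
    for x in (1.5, 2.0, 3.0, 4.0, 8.0):
        print(x, poltyrev_exponent(x))
```

The branches meet continuously at x = 2 and x = 4, which is a quick sanity check on the constants.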

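For channel coding over AWGN with SNR and with rate 
R, there exist block codes of length N whose average error 
probability (averaged over the messages) under ML decoding 
is exponentially upper bounded by $\Pr(\hat{\mathbf{X}}(\mathbf{Y}) \neq \mathbf{X}) \leq e^{-N E_r(\mathrm{SNR}, R)}$, 
where $E_r(\mathrm{SNR}, R)$ is given by [14]: 

$E_r(\mathrm{SNR}, R) = \begin{cases} E_{sp}(\mathrm{SNR}, R) & \text{if } R_{cr} \leq R < C \\ E_{rc}(\mathrm{SNR}, R) & \text{if } R_{ex} < R < R_{cr} \\ E_{ex}(\mathrm{SNR}, R) & \text{if } 0 < R \leq R_{ex} \end{cases}$ 

The boundaries between the regions are as follows. The 
Shannon capacity is $C = \frac{1}{2}\log(1 + \mathrm{SNR})$. The critical 
rate is $R_{cr} \stackrel{\mathrm{def}}{=} \frac{1}{2}\log\left(\frac{1}{2} + \frac{\mathrm{SNR}}{4} + \frac{1}{2}\sqrt{1 + \frac{\mathrm{SNR}^2}{4}}\right)$. The 
expurgation rate is $R_{ex} \stackrel{\mathrm{def}}{=} \frac{1}{2}\log\left(\frac{1}{2} + \frac{1}{2}\sqrt{1 + \frac{\mathrm{SNR}^2}{4}}\right)$. 

The error exponent in the sphere packing region is: 

$E_{sp}(\mathrm{SNR}, R) = \frac{\mathrm{SNR}}{4\beta}\left[\beta + 1 - (\beta - 1)\sqrt{1 + \frac{4\beta}{\mathrm{SNR}(\beta - 1)}}\right] + \frac{1}{2}\ln\left(\beta - \frac{\mathrm{SNR}(\beta - 1)}{2}\left[\sqrt{1 + \frac{4\beta}{\mathrm{SNR}(\beta - 1)}} - 1\right]\right)$ 

where $\beta = 2^{2R}$. In the random coding region: 

$E_{rc}(\mathrm{SNR}, R) = 1 - \beta + \frac{\mathrm{SNR}}{2} + \frac{1}{2}\ln\left(\beta - \frac{\mathrm{SNR}}{2}\right) + \frac{1}{2}\ln(\beta) - \ln(2)R$ 

where now $\beta = 2^{2R_{cr}}$. Lastly, in the expurgation region: 

$E_{ex}(\mathrm{SNR}, R) = \frac{\mathrm{SNR}}{4}\left[1 - \sqrt{1 - 2^{-2R}}\right]$ 

It is also possible to show, using some straightforward algebra, 
that for all rates 0 < R < C, $E_{sp}(\mathrm{SNR}, R)$ coincides with the 
asymptotic expression of Shannon's sphere packing bound for 
AWGN [15]. Hence, it is also an upper bound for the reliability 
function, and thus the bound is tight above the critical rate. 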
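The piecewise block-coding exponent can be evaluated numerically as follows. This is a sketch, assuming rates in bits and exponents in nats as in the text; the function names are hypothetical. The formulas are coded in their textbook form, which is adequate for moderate SNR but can lose floating-point precision at very large SNR.

```python
import math


def capacity(snr):
    return 0.5 * math.log2(1.0 + snr)


def r_cr(snr):
    return 0.5 * math.log2(0.5 + snr / 4.0 + 0.5 * math.sqrt(1.0 + snr ** 2 / 4.0))


def r_ex(snr):
    return 0.5 * math.log2(0.5 + 0.5 * math.sqrt(1.0 + snr ** 2 / 4.0))


def e_sp(snr, rate):
    """Sphere-packing exponent; requires rate > 0 so that beta > 1."""
    beta = 2.0 ** (2.0 * rate)
    root = math.sqrt(1.0 + 4.0 * beta / (snr * (beta - 1.0)))
    first = snr / (4.0 * beta) * (beta + 1.0 - (beta - 1.0) * root)
    second = 0.5 * math.log(beta - snr * (beta - 1.0) / 2.0 * (root - 1.0))
    return first + second


def e_rc(snr, rate):
    """Random-coding (straight line) exponent, with beta fixed at 2^(2 R_cr)."""
    beta = 2.0 ** (2.0 * r_cr(snr))
    return (1.0 - beta + snr / 2.0 + 0.5 * math.log(beta - snr / 2.0)
            + 0.5 * math.log(beta) - math.log(2.0) * rate)


def e_ex(snr, rate):
    """Expurgated exponent."""
    return snr / 4.0 * (1.0 - math.sqrt(1.0 - 2.0 ** (-2.0 * rate)))


def e_r(snr, rate):
    """Piecewise exponent E_r(SNR, R); taken as 0 at and above capacity."""
    if rate >= capacity(snr):
        return 0.0
    if rate >= r_cr(snr):
        return e_sp(snr, rate)
    if rate > r_ex(snr):
        return e_rc(snr, rate)
    return e_ex(snr, rate)


if __name__ == "__main__":
    snr = 100.0  # 20 dB
    for frac in (0.0, 0.25, 0.5, 0.75, 0.95):
        rate = frac * capacity(snr)
        print(f"R/C = {frac:.2f}: E_r = {e_r(snr, rate):.4f}")
```

A useful check on the constants is that the three pieces agree at the region boundaries and that the sphere-packing piece vanishes at capacity.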
















III. Setup 

Our setup is defined as follows. The feedforward and 
feedback channels connecting Terminal A to Terminal B and 
vice versa, respectively, are AWGN channels given by 

$Y_n = X_n + Z_n, \qquad \tilde{Y}_n = \tilde{X}_n + \tilde{Z}_n,$ 

where $X_n, Y_n$ (resp. $\tilde{X}_n, \tilde{Y}_n$) are the input and output of 
the feedforward (resp. feedback) channel at time n, 
respectively. The feedforward (resp. feedback) channel noise $Z_n \sim \mathcal{N}(0, \sigma^2)$ 
(resp. $\tilde{Z}_n \sim \mathcal{N}(0, \tilde\sigma^2)$) is independent of the input 
$X_n$ (resp. $\tilde{X}_n$), and constitutes an i.i.d. sequence. The 
feedforward and feedback noise processes are mutually independent. 

Terminal A is in possession of a message $W \sim \mathrm{Uniform}([M])$, 
to be described to Terminal B over N rounds 
of communication. To that end, the terminals can employ an 
interactive scheme defined by a pair of functions $(\varphi, \tilde\varphi)$ as 
follows: At time n, Terminal A sends a function of its message 
W and possibly of past feedback channel outputs over the 
feedforward channel, i.e., 

$X_n = \varphi(W, \tilde{Y}^{n-1}).$ 

Similarly, Terminal B sends a function of its past observations 
to Terminal A over the feedback channel, i.e., 

$\tilde{X}_n = \tilde\varphi(Y^n).$ 

Remark 1. The dependence of $\varphi$ and $\tilde\varphi$ on n is suppressed. 
In general, we allow these functions to further depend on 
common randomness shared by the terminals. 

We assume that Terminal A (resp. Terminal B) is subject to 
a power constraint P (resp. $\tilde{P}$), namely 

$\sum_{n=1}^{N} \mathbb{E}(X_n^2) \leq N \cdot P, \qquad \sum_{n=1}^{N} \mathbb{E}(\tilde{X}_n^2) \leq N \cdot \tilde{P}.$ 

We denote the feedforward (resp. feedback) SNR by $\mathrm{SNR} \stackrel{\mathrm{def}}{=} P/\sigma^2$ 
(resp. $\widetilde{\mathrm{SNR}} \stackrel{\mathrm{def}}{=} \tilde{P}/\tilde\sigma^2$). The ratio between the feedback SNR 
and the feedforward SNR is denoted by $\Delta\mathrm{SNR} \stackrel{\mathrm{def}}{=} \widetilde{\mathrm{SNR}}/\mathrm{SNR}$. We 
implicitly assume that $\Delta\mathrm{SNR} > 1$. 

An interactive scheme $(\varphi, \tilde\varphi)$ is associated with a rate $R \stackrel{\mathrm{def}}{=} \frac{\log M}{N}$ 
(in bits) and an error probability $p_e(N, R)$, which is the 
probability that Terminal B errs in decoding the message W 
at time N, under the optimal decision rule. We say that an 
error exponent E(R) is achievable if there exists a sequence 
of interactive coding schemes indexed by N with rate at least 
R, such that $p_e(N, R) \,\dot{\leq}\, e^{-N E(R)}$. 


IV. Description of the scheme 

In Subsection II-C we discussed the S-K scheme and 
described its feedforward transmission as a JSCC of the 
estimation error. It was assumed that Terminal A knows 
the estimation error, which is made possible by Terminal 
B sending back its estimate $\hat\Theta_k$, and Terminal A in turn 
subtracting $\Theta$ to obtain $\varepsilon_k = \hat\Theta_k - \Theta$. This procedure holds 
if the feedback is noiseless and fails if the feedback is noisy. 
In the latter case, it was observed in [6] that the transmission 
from Terminal B to Terminal A can be regarded as JSCC 
with side information. Namely, Terminal B wishes to convey 
$\varepsilon_k$ to Terminal A, but knows only $\hat\Theta_k = \Theta + \varepsilon_k$, whereas 
Terminal A knows $\Theta$ and can use it as side information. The 
scheme described in [6] used scalar modulo operations and 
took advantage of the noisy feedback in order to reduce the 
capacity gap, maintaining the simplicity of the original S-K 
scheme. 

The use of a scalar modulo operation benefits from simplicity 
and low delay, at the price of a modulo error probability which is bounded 
away from zero. As shown in Subsection II-B, the error 
probability can be made to approach zero using modulo-lattice 
operations in the limit of large dimension. This motivates 
the following modifications of our scalar scheme: 

1) Replace the scalar interval lattice with a lattice $\Lambda$ of 
dimension N. 

2) Replace the scalar PAM mapping of the message point 
$W \to \Theta$ with an AWGN block code of the same 
dimension N, namely $W \to \boldsymbol{\Theta}$. 

3) Use block code and lattice error exponents for the 
analysis of the aggregate error probability incurred by 
the associated high-dimensional extension of our scalar 
scheme, where interaction takes place on a block-wise 
basis. 

It should be noted that the feedback operations (i.e. the 
modulo-lattice operations) require the knowledge of an entire 
vector of length N, and cannot be implemented on the fly. 
Moreover, the modulo-lattice result requires N channel uses 
to be transmitted. To accommodate this inherent delay we 
use two interlaced block-wise schemes. Having two schemes, 
each using K rounds, requires 2K blocks of length N. For 
simplicity, we use double indexing for the blocks. The block 
index l is represented by a pair of indices (k, j) so that 
l = 2(k - 1) + j. More explicitly, this notation defines 

$\mathbf{X}_k^j \stackrel{\mathrm{def}}{=} [X_{(2(k-1)+j-1)N+1}, \ldots, X_{(2(k-1)+j-1)N+N}]$ 
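The double indexing of blocks can be made concrete with a small helper. This is an illustrative sketch only; the function name `block_slice` is hypothetical, and time indices are 1-based as in the text.

```python
def block_slice(k, j, n):
    """Map the index pair (k, j) to the flat block index l = 2(k-1) + j and to
    the 1-based range of channel-use indices covered by that block of length n."""
    l = 2 * (k - 1) + j
    start = (l - 1) * n + 1
    return l, start, start + n - 1


if __name__ == "__main__":
    n = 4
    for k in (1, 2):
        for j in (1, 2):
            print((k, j), block_slice(k, j, n))
    # The pair (k, 3) addresses the same block as (k + 1, 1).
    print(block_slice(1, 3, n) == block_slice(2, 1, n))
```

One consequence of the formula is that the pair (k, 3) names the same block as (k + 1, 1), so an index j + 1 with j = 2 simply points at the first block of the next round.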


We denote the round index by $k \in [K]$ and the scheme index 
by $i \in \{1, 2\}$. The feedforward of round k and scheme i 
is sent over the block pertaining to indices (k, i), and the 
corresponding feedback is sent over the block pertaining to 
indices (k, i + 1). 

Let us now give a description of scheme i for $i \in \{1, 2\}$. 
The setting of the parameters $\alpha, \beta_k, \gamma_k$ will be discussed in 
the sequel. The dither variables $\mathbf{V}_k^i$ are i.i.d. and uniformly 
distributed on $\mathcal{V}_0$. 

(A) Initialization: 

Terminal A: Map the message $W^i$ to a codeword $\boldsymbol{\Theta}^i$ 
using a codebook for AWGN with average power P. 
Terminal A => Terminal B: 

• Send $\mathbf{X}_1^i = \boldsymbol{\Theta}^i$ 

• Receive $\mathbf{Y}_1^i = \mathbf{X}_1^i + \mathbf{Z}_1^i$ 

Terminal B: Initialize the $\boldsymbol{\Theta}^i$ estimate to $\hat{\boldsymbol{\Theta}}_1^i = \mathbf{Y}_1^i$. 

(B) Iteration: 

Terminal B => Terminal A: 

• Given the $\boldsymbol{\Theta}^i$ estimate $\hat{\boldsymbol{\Theta}}_k^i$, compute and send in the 
following block 

$\tilde{\mathbf{X}}_k^{i+1} = M_\Lambda[\gamma_k \hat{\boldsymbol{\Theta}}_k^i + \mathbf{V}_k^i]$ 

• Receive $\tilde{\mathbf{Y}}_k^{i+1} = \tilde{\mathbf{X}}_k^{i+1} + \tilde{\mathbf{Z}}_k^{i+1}$ 

Terminal A: Extract a noisy scaled version of the estimation 
error $\mathbf{e}_k^i$: 

$\tilde{\mathbf{e}}_k^i = M_\Lambda[\tilde{\mathbf{Y}}_k^{i+1} - \gamma_k \boldsymbol{\Theta}^i - \mathbf{V}_k^i]$ 

Note that $\tilde{\mathbf{e}}_k^i = \gamma_k \mathbf{e}_k^i + \tilde{\mathbf{Z}}_k^{i+1}$, unless a modulo-aliasing 
error occurs. 

Terminal A => Terminal B: 

• Send a scaled version of $\tilde{\mathbf{e}}_k^i$: $\mathbf{X}_{k+1}^i = \alpha \tilde{\mathbf{e}}_k^i$, where $\alpha$ 
is set so as to meet the input power constraint P 
(computed later). 

• Receive $\mathbf{Y}_{k+1}^i = \mathbf{X}_{k+1}^i + \mathbf{Z}_{k+1}^i$ 

Terminal B: Update the $\boldsymbol{\Theta}^i$ estimate $\hat{\boldsymbol{\Theta}}_{k+1}^i = \hat{\boldsymbol{\Theta}}_k^i - \hat{\mathbf{e}}_k^i$, 
where 

$\hat{\mathbf{e}}_k^i = \beta_{k+1} \mathbf{Y}_{k+1}^i$ (1) 

is the MMSE estimate of $\mathbf{e}_k^i$. The optimal selection of 
$\beta_k$ is described in the sequel. 

(C) Decoding: After the reception of block $\mathbf{Y}_K^i$ the receiver 
decodes the message $\hat{W}^i(\hat{\boldsymbol{\Theta}}_K^i)$ using an ML decision 
rule w.r.t. the codebook. 


V. Error analysis and Parameter Setting 

As elaborated above, decoding of the two interlaced 
schemes produces $\hat{W}^i(\hat{\boldsymbol{\Theta}}_K^i)$, and an error occurs if either of 
the decoded messages is not equal to its corresponding sent 
message $W^i$. It is important to note that due to the 
modulo-lattice operations in the feedback, the additive noise corrupting 
$\mathbf{Y}_k^i$ is not Gaussian. However, a Gaussian analysis can be used 
to bound the error probability as we show herein. 

For any $k \in \{1, \ldots, K - 1\}$ we define $E_k^i$ as the event that the 
feedback decoding results in a modulo-aliasing error, i.e. 

$E_k^i \stackrel{\mathrm{def}}{=} \{\gamma_k \mathbf{e}_k^i + \tilde{\mathbf{Z}}_k^{i+1} \notin \mathcal{V}_0\}.$ 

We define $E_K^i$ as the decoding error at the final decoding step: 

$E_K^i \stackrel{\mathrm{def}}{=} \{\hat{W}^i(\hat{\boldsymbol{\Theta}}_K^i) \neq W^i\}.$ 

In order to use the Gaussian analysis we introduce the 
following upper bound for the error probability of scheme i: 

$p_e^i \leq \Pr\left(\bigcup_{k=1}^{K} E_k^i\right).$ (2) 

The inequality stems from the fact that a modulo-aliasing error 
does not necessarily cause a decoding error. 

To proceed, we define the coupled system as a system that 
is fed by the same message and experiences the exact same 
(sample-path) noises, with the only difference being that no 
modulo operations are implemented at either of the terminals. 
Clearly, the coupled system violates the power constraint at 
Terminal B. However, given the message $W^i$, all the random 
variables in the coupled system are jointly Gaussian, and 
in particular, the estimation errors $\mathbf{e}_k^i$ in that system are 
Gaussian for k = 1, ..., K. Moreover, it is easy to see that 
the estimation errors are sample-path identical between the 
original system and the coupled system until the first 
modulo-aliasing error occurs. To be precise we quote [6, Lemma 1]: 

Lemma 1. Let $\overline{\Pr}$ denote the probability operator for the 
coupled process. Then for any K > 1: 

$\Pr\left(\bigcup_{k=1}^{K} E_k^i\right) = \overline{\Pr}\left(\bigcup_{k=1}^{K} E_k^i\right).$ 

Combining the above with (2) and applying the union bound 
in the coupled system, we obtain 

$p_e \leq \sum_{i=1}^{2} \sum_{k=1}^{K} \overline{\Pr}(E_k^i).$ 

Calculating the above probabilities now involves only 
Gaussian random variables, which significantly simplifies the 
analysis. 

From this step on, we perform an asymptotic exponential 
analysis. We note that the sums of probabilities are 
exponentially dominated by the maximal summand; therefore both 
interlaced schemes $i \in \{1, 2\}$ are set to be identical, and we set 
the parameters such that all modulo-aliasing error probabilities 
are the same. Hence 

$p_e \leq 2\left((K - 1)\overline{\Pr}(E_1^1) + \overline{\Pr}(E_K^1)\right) \doteq \overline{\Pr}(E_1^1) + \overline{\Pr}(E_K^1).$ 

Defining $p_{mod} \stackrel{\mathrm{def}}{=} \overline{\Pr}(E_1^1)$ and $p_{dec} \stackrel{\mathrm{def}}{=} \overline{\Pr}(E_K^1)$ yields 

$p_e \,\dot{\leq}\, p_{mod} + p_{dec}.$ 

We now set the lattice second moment to equal the feedback 
power constraint, $\sigma^2(\Lambda) = \tilde{P}$ (and guarantee that this is 
the feedback transmission power by dithering). The 
modulo-aliasing error event is the event where 

$\gamma_k \mathbf{e}_k^i + \tilde{\mathbf{Z}}_k^{i+1} \notin \mathcal{V}_0.$ (3) 

By the coupling argument, we can assume for our bounding 
analysis that the LHS above is Gaussian. The looseness L of 
the lattice is defined by the power ratio of the RHS and LHS 
of (3), i.e., 

$L \stackrel{\mathrm{def}}{=} \frac{\tilde{P}}{\frac{1}{N}\mathbb{E}\|\gamma_k \mathbf{e}_k^i + \tilde{\mathbf{Z}}_k^{i+1}\|^2}.$ 

By the definitions in Subsection II-A, $L = \mu(\Lambda) \cdot G(\Lambda)$. By 
[13, Theorem 5] there exist lattices that asymptotically attain 
both $G(\Lambda) = \frac{1}{2\pi e}$ and the Poltyrev exponent, so we 
can set $\mu = 2\pi e L + o(1)$; then $E_p\left(\frac{\mu}{2\pi e}\right) = E_p(L)$ and 
$p_{mod} \,\dot{\leq}\, e^{-N E_p(L)}$. 

In the next step we send $\mathbf{X}_{k+1}^i = \alpha \tilde{\mathbf{e}}_k^i$, where $\alpha$ is set so 
as to meet the input power constraint P, i.e. $\alpha = \sqrt{LP/\tilde{P}}$. 
The channel output in the next round is thus 

$\mathbf{Y}_{k+1}^i = \alpha\gamma_k \mathbf{e}_k^i + \alpha\tilde{\mathbf{Z}}_k^{i+1} + \mathbf{Z}_{k+1}^i.$ 

Setting $\beta_{k+1}$ in (1) to the optimal Wiener coefficient, one can 
easily calculate the evolution of the estimation variance $\sigma_k^2 \stackrel{\mathrm{def}}{=} \frac{1}{N}\mathbb{E}\|\mathbf{e}_k^i\|^2$. 
We now observe that the channel from $\boldsymbol{\Theta}^i$ to $\boldsymbol{\Theta}^i + \mathbf{e}_K^i$ 
is in fact a vector of independent parallel AWGN channels, 
each with a noise variance $\sigma_K^2$. Namely, after K rounds, we 
have N instances of independent AWGN channels, each with 
SNR given by 

$\mathrm{SNR}_K(L) \stackrel{\mathrm{def}}{=} \mathrm{SNR} \cdot \left(1 + \mathrm{SNR} \cdot \frac{1 - L\,\widetilde{\mathrm{SNR}}^{-1}}{1 + L\,\Delta\mathrm{SNR}^{-1}}\right)^{K-1}$ 

Fig. 1. Error exponents with and without feedback for SNR = 20dB and 
ΔSNR = 30dB (horizontal axis: R/C). 
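The effective SNR seen by the block code after K rounds, as a function of the lattice looseness L, can be evaluated directly from the expression above. This is a minimal sketch; the function name `snr_after_k_rounds` and its argument names are hypothetical, and all quantities are in linear scale (not dB).

```python
def snr_after_k_rounds(snr, delta_snr, looseness, num_rounds):
    """SNR_K(L) for feedforward SNR `snr`, feedback-to-feedforward ratio
    `delta_snr`, lattice looseness `looseness` and `num_rounds` rounds."""
    snr_fb = snr * delta_snr                     # feedback SNR
    gain = 1.0 + snr * (1.0 - looseness / snr_fb) / (1.0 + looseness / delta_snr)
    return snr * gain ** (num_rounds - 1)


if __name__ == "__main__":
    snr, delta_snr = 100.0, 1000.0               # 20 dB and 30 dB
    for looseness in (1.0, 10.0, 100.0, 1000.0):
        print(looseness, snr_after_k_rounds(snr, delta_snr, looseness, num_rounds=3))
```

A larger looseness lowers the per-round gain, and in the limit of very clean feedback the expression approaches the noiseless S-K value SNR(1 + SNR)^(K-1).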


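We would now like to set $E_r(\mathrm{SNR}_K(L), RK) = E_p(L)$ and 
solve for L. Assuming both $E_r$ and $E_p$ are in their expurgation 
regions (as shall be verified later), and using the high-SNR 
approximation of $\mathrm{SNR}_K(L)$, we would like to solve: 

$\left(\frac{\widetilde{\mathrm{SNR}}}{L} - 1\right)^{K} \cdot \frac{L}{4\,\Delta\mathrm{SNR}} \cdot \frac{1}{\eta(RK)} = \frac{1}{8}L,$ 

where $\eta(RK) \stackrel{\mathrm{def}}{=} \frac{1}{1 - \sqrt{1 - 2^{-2RK}}}$. The solution yields 
$L^* = \widetilde{\mathrm{SNR}} / \left(1 + \left(\tfrac{1}{2}\eta(RK)\Delta\mathrm{SNR}\right)^{1/K}\right)$. Plugging it in (4) yields for 
any K > 1: 

$E_{fb}(R) \geq \frac{\mathrm{SNR} \cdot \Delta\mathrm{SNR}}{16K\left(1 + \left(\tfrac{1}{2}\eta(RK)\Delta\mathrm{SNR}\right)^{1/K}\right)}\,(1 + o(1))$ (5) 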


At R = 0 an optimization over K is possible, yielding $K^*$: 

$K^* = 0.78 \cdot \ln\left(\tfrac{1}{2}\Delta\mathrm{SNR}\right) \approx 0.18 \cdot \Delta\mathrm{SNR}_{dB} - 0.54.$ 

So the better of $\lceil K^* \rceil$ and $\lfloor K^* \rfloor$ can be plugged in (5), 
giving a bound for $E_{fb}$. This bound holds as long as this 
K and $L^*$ both satisfy the expurgation region assumptions: 
$L \geq 4$ and $KR \leq R_{ex}(\mathrm{SNR}_K(L))$. 

For rates outside this region one can simply use 
$\frac{1}{2K}E_r(\mathrm{SNR}_K(L), KR)$ with L and K found at the highest 
rate in the expurgation region. 
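The zero-rate round count can be computed from either form of the expression for K*. This is a small sketch; the function name `k_star` is hypothetical, and the constants 0.78, 0.18 and 0.54 are the rounded values quoted in the text.

```python
import math


def k_star(delta_snr_db):
    """Zero-rate choice of the number of rounds, from the two expressions in
    the text, plus the floor/ceil integer candidates."""
    delta_snr = 10.0 ** (delta_snr_db / 10.0)
    from_log = 0.78 * math.log(0.5 * delta_snr)
    from_db = 0.18 * delta_snr_db - 0.54
    return from_log, from_db, (math.floor(from_log), math.ceil(from_log))


if __name__ == "__main__":
    for db in (10.0, 20.0, 30.0, 40.0):
        print(db, k_star(db))
```

Since K must be an integer, both candidates are returned so that each can be tried in the bound.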
Therefore, we can now map the message $W^i$ into $\boldsymbol{\Theta}^i$ using a 
Gaussian codebook of block length N and rate $K \cdot R$ to obtain 

$p_{dec} \,\dot{\leq}\, e^{-N E_r(\mathrm{SNR}_K(L),\, KR)}.$ 


Note that the rate $K \cdot R$ is chosen such that the overall rate 
(over K rounds) is R. We therefore immediately obtain the 
following. 

Theorem 1. The error probability attained by our suggested 
interactive scheme is upper bounded by $p_e \,\dot{\leq}\, e^{-2NK \cdot E_{fb}(R)}$, 
where 

$E_{fb}(R) \stackrel{\mathrm{def}}{=} \max_{K \in \mathbb{N},\, L > 1} \left\{ \frac{\min\{E_r(\mathrm{SNR}_K(L), KR),\, E_p(L)\}}{2K} \right\}$ (4) 

Note that the division by 2K is due to normalization of the 
error exponents by the actual code length, which is 2NK. The 
trade-off is now clear: setting the lattice looseness L to be large 
reduces $p_{mod}$ but also reduces $\mathrm{SNR}_K(L)$, hence enlarging $p_{dec}$, 
and vice versa. Due to the monotonicity of $E_r(\mathrm{SNR}_K(L), K \cdot R)$ 
and $E_p(L)$ in L, a numerical solution to (4) can be easily 
found. 
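The maximization in (4) can be approximated by a brute-force grid search. The following is a sketch under stated assumptions, not the authors' procedure: K is limited to a small range, L is searched on a logarithmic grid between 1 and the feedback SNR, exponents are in nats and rates in bits, and all function names are hypothetical. The block-code exponent is written in an algebraically rearranged form of the standard expressions, chosen only to avoid cancellation at the very large SNR_K values that arise after several rounds.

```python
import math


def poltyrev(x):
    """Poltyrev exponent E_p(x); taken as 0 for x <= 1 (assumption)."""
    if x <= 1.0:
        return 0.0
    if x <= 2.0:
        return 0.5 * (x - 1.0 - math.log(x))
    if x <= 4.0:
        return 0.5 * math.log(math.e * x / 4.0)
    return x / 8.0


def block_code_exponent(snr, rate):
    """E_r(SNR, R), rearranged for floating-point stability; 0 at or above capacity."""
    if rate >= 0.5 * math.log2(1.0 + snr):
        return 0.0
    half_root = 0.5 * math.sqrt(1.0 + snr * snr / 4.0)
    beta_cr = 0.5 + snr / 4.0 + half_root             # equals 2^(2 R_cr)
    if rate >= 0.5 * math.log2(beta_cr):               # sphere-packing region
        beta = 2.0 ** (2.0 * rate)
        root = math.sqrt(1.0 + 4.0 * beta / (snr * (beta - 1.0)))
        log_arg = 4.0 * beta * beta / (snr * (beta - 1.0) * (root + 1.0) ** 2)
        return snr / (2.0 * beta) - 1.0 / (root + 1.0) + 0.5 * math.log(log_arg)
    if rate > 0.5 * math.log2(0.5 + half_root):        # random-coding region
        d = 0.5 + 0.5 / (2.0 * half_root + snr / 2.0)  # equals beta_cr - snr/2
        return (1.0 - d + 0.5 * math.log(d) + 0.5 * math.log(beta_cr)
                - math.log(2.0) * rate)
    u = 2.0 ** (-2.0 * rate)                           # expurgation region
    return snr / 4.0 * u / (1.0 + math.sqrt(1.0 - u))


def snr_k(snr, delta_snr, looseness, num_rounds):
    gain = 1.0 + snr * (1.0 - looseness / (snr * delta_snr)) / (1.0 + looseness / delta_snr)
    return snr * gain ** (num_rounds - 1)


def feedback_exponent(snr, delta_snr, rate, max_rounds=10, grid=300):
    """Grid-search approximation of (4). Returns (exponent, K, L)."""
    snr_fb = snr * delta_snr
    best = (0.0, None, None)
    for num_rounds in range(1, max_rounds + 1):
        for n in range(grid + 1):
            looseness = snr_fb ** (n / grid)           # log grid over [1, snr_fb]
            e_dec = block_code_exponent(snr_k(snr, delta_snr, looseness, num_rounds),
                                        num_rounds * rate)
            value = min(e_dec, poltyrev(looseness)) / (2 * num_rounds)
            if value > best[0]:
                best = (value, num_rounds, looseness)
    return best


if __name__ == "__main__":
    snr, delta_snr = 100.0, 1000.0                     # 20 dB and 30 dB
    cap = 0.5 * math.log2(1.0 + snr)
    for frac in (0.0, 0.25, 0.5, 0.75):
        print(frac, feedback_exponent(snr, delta_snr, frac * cap),
              block_code_exponent(snr, frac * cap))
```

The grid resolution and the cap on K are arbitrary choices for the sketch; a finer grid or a one-dimensional root finder on L (exploiting the monotonicity noted above) would give a tighter value.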


VI. Discussion 

Numerical evaluation of $E_{sp}$, $E_r$ and $E_{fb}$ for SNR = 
20dB and ΔSNR = 30dB is depicted in Fig. 1. It is clear 
that in this scenario our scheme improves the error exponents 
for most rates below capacity. 

It is now instructive to give an approximation for high 
SNR, namely SNR ≫ 1. It is easy to see that for SNR ≫ 1: 

(snr-l)^ 